[Fix][Doc] Fix LocalFile doc (#7887) #7890

Closed
wants to merge 6 commits into from

Conversation

YOMO-Lee
Contributor

Supplement and improve the LocalFile connector documentation on file filtering [(#7887)](#7887)
@YOMO-Lee YOMO-Lee marked this pull request as draft October 22, 2024 11:50
@YOMO-Lee YOMO-Lee marked this pull request as ready for review October 22, 2024 11:51
@YOMO-Lee YOMO-Lee marked this pull request as draft October 22, 2024 11:51
@YOMO-Lee YOMO-Lee marked this pull request as ready for review October 22, 2024 11:51
Member

@Hisoka-X Hisoka-X left a comment

Thanks @YOMO-Lee for raising this PR. Please update the docs for the whole File connector series as well.

Comment on lines +257 to +261
The filtering format is similar to wildcard matching file names in Linux.

However, it should be noted that unlike Linux wildcard characters, when encountering file suffixes, the middle dot cannot be omitted.

For example, `abc20241022.csv`, the normal Linux wildcard `abc*` is sufficient, but here we need to use `abc*.*` , Pay attention to a point in the middle.
Member

Suggested change
The filtering format is similar to wildcard matching file names in Linux.
However, it should be noted that unlike Linux wildcard characters, when encountering file suffixes, the middle dot cannot be omitted.
For example, `abc20241022.csv`, the normal Linux wildcard `abc*` is sufficient, but here we need to use `abc*.*` , Pay attention to a point in the middle.
Matching rules use regular expression rules.

I think we should tell users which rules we follow.
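
The observation in the quoted lines (that `abc*` is not enough but `abc*.*` works) is consistent with the pattern being read as a regular expression rather than a shell wildcard. The sketch below illustrates this with Python's `re` and `fnmatch` modules. It assumes the pattern must match the whole file name, which this thread does not confirm; for patterns this simple, Python and Java regex syntax behave the same way.

```python
import fnmatch
import re

name = "abc20241022.csv"

# Shell-style wildcard: `*` matches any run of characters.
glob_hit = fnmatch.fnmatchcase(name, "abc*")

# The same text read as a regex and required to match the whole name:
# "ab" followed by zero or more "c", so "20241022.csv" is left unmatched.
regex_short = re.fullmatch(r"abc*", name) is not None

# `abc*.*` read as a regex: "ab", zero or more "c", then `.*` (any characters).
regex_long = re.fullmatch(r"abc*.*", name) is not None

print(glob_hit, regex_short, regex_long)  # True False True
```

Under that reading, the `.` in `abc*.*` is not a literal dot before the suffix; `.*` simply means "any characters", which is why the longer pattern matches.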

Contributor Author

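I think that if we only say "regular expression", users will easily get confused. In my own case, I used a regex and it never took effect. It would be best to include an example.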

Member

Yes, we can add some demos, but we should first tell users which rules we follow. Please go ahead and add some demos. For example:

File Structure Example:
/project/documents/report.txt
/project/documents/notes.txt
/project/data/input.csv
/project/data/processed/output.csv
/project/data/archive/old_data.csv
/project/images/logo.png
/project/scripts/script.sh
/project/scripts/utils/helpers.sh

Matching Rules Example:
Example 1: Match all .txt files
Regular Expression:
.*/.*\.txt$
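
A quick way to check this pattern against the file structure above is sketched here in Python. It assumes the pattern is applied to the full path and must match all of it; whether the connector matches the full path or only the file name is not stated in this thread.

```python
import re

# Paths copied from the file structure example.
paths = [
    "/project/documents/report.txt",
    "/project/documents/notes.txt",
    "/project/data/input.csv",
    "/project/data/processed/output.csv",
    "/project/data/archive/old_data.csv",
    "/project/images/logo.png",
    "/project/scripts/script.sh",
    "/project/scripts/utils/helpers.sh",
]

# Any directory prefix, a slash, any file name, then a literal ".txt" at the end.
pattern = re.compile(r".*/.*\.txt$")

txt_files = [p for p in paths if pattern.fullmatch(p)]
print(txt_files)
# ['/project/documents/report.txt', '/project/documents/notes.txt']
```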

Example 2:


Contributor Author

OK, I'll improve it shortly.

Contributor Author

@YOMO-Lee YOMO-Lee left a comment

The examples are also explained further down.

YOMO-Lee and others added 2 commits October 25, 2024 15:42
1. When the ClickHouse connector is configured with multiple parallelism, the job finishes extracting data but cannot stop normally
[(#7897)](#7897)

2. Added E2E test cases for this issue [(#7897)](#7897)

3. Local developers who want to see **Job Progress Information** promptly need to modify the following configuration file. The configuration in config has no effect.
```
seatunnel engine/seatunnel-engineer-common/src/main/resources/seatunnely.yaml
```
@github-actions github-actions bot added the e2e label Oct 26, 2024
@YOMO-Lee YOMO-Lee closed this Oct 26, 2024